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Abstract 

The  construction  of  elementary  unitary  matrices  that  transform  a  complex 
vector  to  a  multiple  of  ,  the  first  column  of  the  identity  matrix,  are  studied.  We 
present  four  variants  and  their  software  implementation,  including  a  discussion  on 
the  LAPACK  subroutine  CLARFG.  Comparisons  are  also  given. 


1  Introduction 

The  goal  of  this  paper  is  to  survey  elementary  unitary  matrices.  We  begin  by  first 
discussing  elementary  unitary  matrices  that  are  Hermitian.  Let  w  be  a  complex  vector. 
Define  the  elementary  Hermitian  matrix  U  =  /  -  2 wwH ,  where  wHw  =  1.  It  is  easily 
verified  that  U  is  both  Hermitian  and  unitary.  In  particular,  if  w  is  a  real  vector,  then 
U  is  orthogonal  and  symmetric,  and  is  commonly  referred  to  as  a  Householder  reflector. 
Since  U  is  unitary,  its  inverse  is  readily  available. 

Two  important  applications  of  elementary  Hermitians  include  the  computation  of 
the  QR  factorization  of  a  matrix,  and  the  orthogonal  reduction  of  a  square  matrix 
A  into  upper  Hessenberg  form.  The  former  application  is  often  used  for  the  stable 
computation  of  a  solution  for  the  linear  least  squares  problem.  The  latter  application 
is  needed  for  many  eigenvalue  computations.  The  literature  on  elementary  Hermitians 
is  vast.  For  information  on  applications  concerning  Householder  matrices  see  Golub 
and  Van  Loan  [4].  Parlett  [7]  examines  the  algorithmic  and  stability  issues  of  computing 
Householder  matrices.  A  detailed  error  analysis  by  Wilkinson  [10]  shows  the  stability  of 
numerical  techniques  using  elementary  Hermitians.  Besides  these  excellent  numerical 
properties,  their  application  demonstrates  their  efficiency.  If  A  is  a  matrix,  then  U  A  = 
A  -  2w(AH w)H ,  and  hence  explicit  formation  and  storage  of  U  is  not  required.  Only 
the  ability  to  form  the  matrix-vector  product  AH w  and  a  rank  one  update  to  A. 

Fundamental  to  the  use  of  elementary  Hermitians  in  the  above  applications  is  their 
ability  to  transform  a  vector  x  to  a  multiple  of  e1;  the  first  column  of  the  identity 
matrix.  As  we  will  show,  an  elementary  Hermitian  is  not  always  defined  when  x  is  to 
be  transformed  to  a  real  multiple  of  e\.  However,  the  crucial  property  of  unitariness 
may  be  preserved.  The  purpose  of  this  paper  is  to  review  and  examine  the  details  of 
constructing  an  elementary  unitary  matrix  so  that  a  complex  vector  x  is  transformed 
to  a  multiple  of  e\ . 

’Department  of  Computational  and  Applied  Mathematics,  Rice  University.  (lehoucqQrice.edu) 
This  work  was  supported  by  DARPA  under  contract  TV-ORA4466.01 
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The  paper  is  organized  as  follows.  In  §  2  the  mathematical  problem  is  stated  and 
general  conditions  for  constructing  elementary  unitary  matrices  are  derived.  The  four 
approaches  for  construction  are  then  introduced  in  §  2. 1  §  2.4.  The  first  one  is  imple¬ 
mented  in  EISPACK  [8]  and  is  based  upon  a  development  by  Wilkinson  [9,  pages  48-50]. 
The  LINPACK  [2]  approach  is  the  second  one  studied.  The  third  approach  is  due  to 
Hammarling  and  Du  Croz.  It  is  implemented  in  the  NAG  Fortran  Library  subrou¬ 
tine  F06HRF  [6].  The  final  variation  is  implemented  by  the  LAPACK  [1]  subroutine 
CLARFG.  The  details  of  this  software  implementation  are  also  discussed.  Section  three 
is  a  comparison  and  summary  of  our  findings.  In  fact,  our  attempt  to  understand  the 
differences  between  the  Wilkinson  approach  and  the  alternate  formulation  implemented 
by  LAPACK  led  to  this  study. 

We  employ  Householder  notational  conventions.  Capital  and  lower  case  letters 
denote  matrices  and  vectors,  respectively,  while  lower  case  Greek  letters  denote  scalars. 
In  particular,  £,•  =  ej x  denotes  the  i-th  element  of  the  vector  x.  Unless  otherwise  stated, 
all  quantities  are  assumed  to  be  complex  and  i  =  \/—l.  The  real  and  imaginary  part  of 
a  complex  number  a  are  denoted  by  Re(a)  and  Im(a),  respectively.  The  vector  norm 
used  is  the  Euclidean  one:  ||a:||  =  V xH x.  The  reader  is  also  reminded  that  |a|2  =  cia 
where  a  is  the  complex  conjugate  of  a. 

2  Elementary  Unitary  matrices 

Let  us  clearly  state  the  problem  at  hand.  Find  an  elementary  unitary  matrix  that 
satisfies  the  following  three  conditions: 

U  =  I  -  <twwh ,  (JH x  =  j\\x\\e\,  |q|  =  1,  (1) 

where  x  is  a  vector  with  n  components.  The  third  condition  is  a  consequence  of 
the  second  one  since  ||f///x||/||x||  =  |-y | .  The  second  condition  gives  that  xHUHx  = 
7||*||a;wei  implying  that  U  is  an  elementary  Hermitian  matrix  if  and  only  if  a  and 
')xHe\  are  real. 

The  matrix  U  as  defined  by  (1)  is  a  special  member  of  the  more  general  class  of 
elementary  matrices  defined  by 

E(w,v;cr)  =  I  -  <jwvh .  (2) 

See  Householder  [5]  and  Wilkinson  [11]  for  introductions.  Dubrulle  [3]  presents  a  com¬ 
prehensive  study  for  the  case  of  real  w,v  and  <7,  that  includes  a  discussion  to  block 
implementations. 

Let  us  determine  general  conditions  for  an  elementary  matrix  to  be  unitary.  Since 
E(w,v,cr)  must  be  unitary, 

/  =  (I  —  awvH)H  (I  —  awvH)  =  /  —  dvwH  —  awvH  +  aa(wH  w)vvH . 

Cancelling  terms  results  in 

acr(wHw)vvH  —  avwH  +  awvH .  (3) 

Rearranging  terms  gives  (aa(wHw)v  -  crw)vH  =  avwH ,  and  a  row  space  argument 
implies  that  w  and  v  are  linearly  dependent.  Substituting  v  =  w  into  (3)  gives 

\cr\2\\w\\2  =  a  +  a  =  2Re(a)  (4) 


2 


as  the  required  relationship  between  a  and  w.  Note  that  the  above  relationship  contains 
some  redundancy.  Scaling  to  by  a  complex  number  rj  and  dividing  a  by  |7y|2  still 
satisfy  the  relationship.  This  scaling  also  satisfies  the  second  condition  of  (1)  since 
(a\rj\~2)(wr])(wr))H  =  awwH .  Finally,  the  second  condition  of  (1)  gives  that  w  is  a 
linear  combination  of  x  and  ej. 

Four  sets  choices  for  w,  a  and  7  are  the  subject  of  the  §  2. 1-2.2.  A  standard 
modification  for  w  =  fix  -f  ve\  is  that  1  and  v  share  the  same  sign.  In  floating 
point  arithmetic,  this  choice  of  sign  leads  to  a  small  relative  error  when  computing 
w.  For  example,  if  p.  =  1  the  sign  of  ej  is  that  of  Re(£i).  Parlett  [7]  presents  a 
thorough  discussion  on  the  choice  of  sign  when  computing  Householder  reflectors.  For 
the  remainder  of  the  paper,  v  =  Sign(f2e(£1))||a;||. 

Note  that  an  elementary  Hermitian  (and  Householder)  matrix  chooses  w  =  (x  + 
ue\)/\\x  +  ve\\\  so  that  wH w  —  1,  7  =  —1.  Conditions  (1)  and  (4)  are  satisfied. 

2.1  The  Wilkinson  Approach 

Wilkinson  [9,  pages  49-50]  suggested  the  following  modification.  Let  £1  =  e^1  |£i  |  where 
0  <  #1  <  27 r  and 


x  =  e»'y  =  eie'[ \ti\,e~i9' e~i9'tn]T. 

Then  even  if  ^  has  a  non-zero  imaginary  part,  ejy  is  a  real  number,  an  elementary 
Hermitian  P  may  be  constructed  so  that  Py  is  areal  multiple  of  e\.  Thus,  condition  (4) 
is  satisfied  as  already  discussed.  Set  U  =  el&i  P  and 

UHx  =  (e-'9'P)(e'e'y)  =  Py  =  i\\x\\eu 

where  7  =  —  1.  The  matrix  U  is  a  multiple  of  an  elementary  unitary  matrix.  Since  the 
first  component  of  y  is  a  non-negative  number,  9\  is  zero. 

Although  EISPACK  [8]  does  not  have  a  subroutine  that  computes  an  elementary 
unitary  matrix,  the  subroutines  CORTH  and  HTRIDI  implement  a  slight  variation  of 
the  Wilkinson  approach.  CORTH  [8,  pages  300-305]  and  HTRIDI  [8,  pages  357-363] 
orthogonally  reduce  a  general  and  Hermitian  matrix  to  upper  Hessenberg  and  tridiag¬ 
onal  form,  respectively.  They  set  U  -  P  directly  and  thus  transform  y  to  - ei e'  \\x\\e\. 
The  software  sets  w  =  x  +  ezdl  ||x||e1(=  Pe'l(y  +  ||a:||ei))  and  a  =  2(wHw)~ F  Hence 
wHw  =  2||a;|[(||a:||  +  |£i|)  and  a  —  l/||a;||(|ja:||  +  |£i|)  thus  satisfying  condition  (4).  A 
simple  calculation  shows  that 

UHx  =  x  -  a(wHx)x  =  x-  cr|M|(||a;||  +  =  7||*||ci, 

where  7  =  —  et&1 .  In  order  to  prevent  possible  overflow  when  computing  a,  the  vector 
x  is  is  initially  normalized  by  6  =  |i^e(^1)|  +  |/m(^)|  H - +  \Re(£n)\  +  |/m(^n)|. 

2.2  The  LINPACK  Approach 

As  in  EISPACK,  LINPACK  does  not  have  a,  general  purpose  subroutine  implementing  the 
solution  of  problem  (1).  However,  subroutines  CQRDC  [2,  chapter  9]  and  CSVDC  [2, 
chapter  11]  employ  elementary  unitary  matrices.  Subroutines  CQRDC  and  CSVDC 
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compute  the  QR  factorization  and  singular  value  decomposition  of  a  complex  matrix, 
respectively. 

The  LINPACK  form  for  an  elementary  unitary  matrix  is  easily  derived  by  scaling 
the  w  used  by  EISPACK  with  r/  =  e~l&1  /||x||.  From  the  remarks  regarding  the  scaling 
of  equation  (4),  a  =  ||x||/(||a:||  +  |£i|)  and  the  LINPACK  U  is  such  that  UHx  =  7||a:|]e1 
where  7  =  —  elSl .  Note  that  for  non-zero  x,  .5  <  a  <  1  thus  avoiding  the  risk  of  overflow 
possible  in  the  in  the  (unsealed)  EISPACK  variant. 


2.3  The  NAG  Approach 

The  second  form  for  an  elementary  unitary  matrix  is  due  to  Hammarling  and  Du 
Croz  [6],  (Introduction  -  F06).  Unlike  the  previous  two  versions,  this  one  computes  an 
elementary  unitary  matrix  U  so  that  UHx  is  a  real  multiple  of  e\ .  As  explained  at  the 
beginning  of  §  2,  the  resulting  a  cannot  be  real  unless  £1  is  also. 

Choosing  a  =  ( xH w)~x  where  w  =  x  +  ve\  results  in  UHx  =  (I  —  awwH)x  = 
x  —  (awH x)w  =  7/^!  where  7  =  —  1.  This  choice  of  a  will  satisfy  (4)  as  we  now 
demonstrate.  First 


wH  x  —  ( xH  +  vej  )x  —  xH x  +  =  v(v  +  Ct), 

which  determines  a  and  ||w||2  =  (xH  +  ue\ )(x  +  ue\)  -  2 u(v  +  Re(^)).  Finally 

(wH  x)(xH  w)(a  +  a)  =  (wH  x)(xH  w){— \r- -  +  ~i~), 

wnx  xMw 


H  M 

=  X  W  +  W  X, 

=  +  (1)  +  +  Zi), 

=  2u(u  +  Re(^i)), 


shows  that  |cr|2(<r  +  a)  —  1 1 rc? 1 1 2  as  claimed.  Note  that  when  £1  is  real,  U  is  Hermitia.n. 
This  version  does  not  appear  to  be  as  widely  known  as  the  Wilkinson  one. 

The  NAG  subroutine  F06HRF  computes  an  elementary  unitary  matrix  so  that 


Re(\r]\  2<t)  =  1  and  1  <  e^r/w  <  \/2,  (5) 

for  some  scale  factor  rj.  First  note  that  ef  w/(£i  +  v)  =  1.  Then,  from  the  manner  in 
which  v  was  chosen,  it  follows  that  Re(<r\^  +  z/|-2)  =  (||x||  +  |i?e(^i)|)/||x||.  Hence  the 
choice  of  rj  =  \/(||x||  +  |i^e(^1)|)/||x||/(^1  +  v)  is  such  that 


fT  feliJM  and  |,|-2g  =  1ML+  sign(M6)K. 

V  11*11  M  +  |J8«(fi)l 

and  the  two  conditions  (5)  on  T)  are  satisfied.  Note  that  1  <  |r/|_2|cr|  <  2. 


2.4  The  LAPACK  approach 

The  LAPACK  subroutine  CLARFG  is  a  slight  variant  of  the  one  used  by  the  NAG 
subroutine  F06HRF.  The  resulting  code  is  an  excellent  example  of  the  art  of  developing 
software  from  a  numerical  algorithm.  Using  the  notation  of  the  previous  section  for  w 
and  <7,  let  t?-1  =  ^  +  v  and  hence  ejr/w  -  1  and  \rj\~2a  =  (£]  +  v)/v.  Conditions  (1) 


4 


Problem  Statement: 


Compute  U  —  I  —  awwH  where  UH x  =  7||a;||e1,  UHU  =  I,  and  |-y |  =  1. 


Notation: 


6  = 

&  =  ejx  for  i  =  1  :  n,  v  =  Sign(Ee(fi))|| 
ildl  |£i|  where  0  <  9\  <  ‘2-k,  k  =  (|i?e(^1)|  -f 

*11. 

Il*ll)/ll*ll 

Method 

w 

a 

7 

EISPACK 

x  +  e401  ||rc  ||e! 

1/11*11(161  +  INI) 

-e101 

LINPACK 

xe~lh /\\x\\  +  e1 

11*11/(161  + 11*11) 

_ei&l 

NAG 

(x  +  i/ea)VK/(fi  +  v) 

(6  +  v)/vK 

-1 

LAPACK 

(x  +  uel  )/(6  +  v) 

(6  +  v)/v 

-1 

Table  1:  Comparisons  for  the  four  variants  used  to  compute  an  elementary  unitary 
matrix 

and  (4)  are  satisfied  since  w  and  cr  are  scaled  here  by  rj  and  |r/|~2,  respectively.  Note 
that  1  <  |r/|_2|(r|  <  2.  If  x  is  a  real  multiple  of  t\  then  r  <—  0  and  U  —  I. 

Representing  U  for  use  in  further  computation  only  requires  storage  for  the  complex 
r.  The  storage  for  x  may  be  re-used  to  write  both  u  and  the  essential  part  of  w,  that 

is  *  <-  [^,6/(6  +  y),...,W(6  +  Z')]T 

One  who  reviews  subroutine  CLARFG  will  notice  the  programmer  took  care  not  to 
reciprocate  the  number  ||x||  that  may  fall  below  a  certain  machine  dependent  tolerance, 
SAFMIN.  The  value  SAFMIN,  computed  by  the  LAPACK  auxiliary  subroutine  SLAMCH 
is  a  machine  dependent  lower  bound  for  numbers  that  may  be  safely  reciprocated  and 
not  cause  an  overflow  condition.  If  ||z||  is  less  than  the  lower  bound  then  the  vector 
x  is  scaled  by  a  multiple  of  the  reciprocal  of  SAFMIN  until  it  is  at  least  as  large  as 
SAFMIN.  Defining  the  integer  k  to  represent  the  number  of  scalings  required,  let  6  = 
fc/SAFMIN.  The  number  er  may  now  be  safely  computed  as  a  <—  (v  +  0£ \)/v  where  v  <— 
Sign(f£e(0£i))(||0:r||).  The  essential  part  of  u  is  computed  as  (0£i  ■  ■  ■ ,  0£n]T- 

This  same  scaling  technique  is  also  used  by  the  real  precision  version  of  CLARFG — 
SLARFG. 

3  Comparisons  and  Conclusions 

Four  different  forms  of  elementary  unitary  matrices  were  presented  to  solve  the 
elimination  problem  defined  by  (1).  Table  1  presents  a  summary  of  the  four  approaches. 
We  now  briefly  analyze  the  information  in  the  table. 

•  The  EISPACK  approach.  Benefit:  Real  a.  Cost:  An  initial  scaling  of  x  to  prevent 
possible  overflow  when  computing  a  and  storing  a  possibly  complex  7. 

•  The  LINPACK  approach.  Benefit:  Real  cr;  .5  <  |cr|  <  1.  Cost:  Storing  a  possibly 
complex  7. 
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•  The  NAG  approach.  Benefit:  Directly  obtains  a  real  7.  Cost:  Storing  a  possibly 
complex  cr  and  forming  a  square  root;  1<M<2. 

•  The  LAPACK  approach.  Benefit:  Directly  obtains  a  real  7.  Cost:  Storing  a 
possibly  complex  <7;  1  <  \a\  <  2. 

Examining  the  application  of  U  to  a  matrix  A  allows  the  following  analysis: 

•  The  EISPACK  and  UNPACK  approaches  require  computing  A  —  crw(AH w)H  with 
real  cr. 

•  The  LAPACK  and  NAG  compute  A  —  crw(AHw)H  with  possibly  complex  a. 

Since  computing  the  QR  factorization  of  a  matrix, the  bidiagonal,  Hessenberg,  and 
tridiagonal  reductions,  involve  applications  of  elementary  unitary  matrices  to  A,  the 
computation  is  always  cheaper  with  real  a. 

The  benefit  of  directly  computing  a  real  7  is  that  it  allows  reuse  of  software.  For  ex¬ 
ample,  when  reducing  a  Hermitian  matrix  to  tridiagonal  form,  the  resulting  tridiagonal 
matrix  is  real,  and  the  symmetric  tridiagonal  QR  algorithm  may  then  be  employed  [1], 
The  same  may  be  said  about  the  preliminary  reduction  of  a  matrix  to  bidiagonal  form 
needed  by  the  singular  value  decomposition:  see  [1,  page  42]  and  [2,  chapter  9].  A 
third  example  is  when  computing  a  QR  factorization  of  a  matrix  A.  For  stable  compu¬ 
tation  of  a  solution  to  a  linear  least  squares  problem,  a  triangular  system  of  equations 
involving  R  is  often  required.  Directly  computing  a  real  7  results  in  real  numbers  on 
the  diagonal  of  R.  Thus  the  careful  scaling  algorithms  used  by  LAPACK  when  solving 
triangular  system  of  equations  may  be  employed. 

On  the  other  hand,  when  using  either  the  EISPACK  and  LINPACK  forms  of  elemen¬ 
tary  unitary  matrices,  a  diagonal  unitary  matrix  D  may  always  be  computed  to  allow 
reuse  of  software  or  the  use  of  careful  scaling  algorithms.  For  example,  when  computing 
a  QR  factorization  of  a  matrix  A  with  m  rows  and  n  columns,  let  D  =  Diag(tfi , . . . ,  6m) 
be  the  diagonal  matrix  where  6j  =  ej Re j /\ej Re j\  for  j  =  1  :  min (m,n)  and  Sj  =  1 
otherwise.  It  then  follows  that  A  =  QR  =  (QD)(Dn R),  QD  is  unitary  and  the  diag¬ 
onal  elements  of  DH R  are  real  numbers.  Similar  procedures  may  be  employed  when 
further  reducing  a  Hermitian  tridiagonal  matrix  to  real  symmetric  tridiagonal  form 
and  when  reducing  a  matrix  to  real  bidiagonal  form.  Further  computation  and  storage 
is  required.  The  elementary  unitary  matrices  based  on  the  Hammarling  and  Du  Croz 
approach  implicitly  perform  this  post-processing  step. 
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